Stomach cancer image segmentation method based on EfficientNetV2 and object-contextual representation
Di ZHOU, Zili ZHANG, Jia CHEN, Xinrong HU, Ruhan HE, Jun ZHANG
Journal of Computer Applications    2023, 43 (9): 2955-2962.   DOI: 10.11772/j.issn.1001-9081.2022081159

To address the problems that the upsampling process of U-Net tends to lose details and that stomach cancer pathological image datasets are generally small, which easily leads to over-fitting, an automatic segmentation model for stomach cancer pathological images based on an improved U-Net, named EOU-Net, was proposed. In EOU-Net, EfficientNetV2 was used as the backbone of the existing U-Net model, thereby enhancing the feature extraction ability of the network encoder. In the decoding stage, the relations between cell pixels were explored on the basis of Object-Contextual Representation (OCR), and the improved OCR module was used to address the loss of detail in the upsampled images. Then, Test Time Augmentation (TTA) post-processing was applied: predictions were made on copies of the input image obtained by flipping and by rotations at different angles, and these predictions were combined by feature fusion to further optimize the network output, thereby effectively alleviating the problem of small medical datasets. Experimental results on the SEED, BOT and PASCAL VOC 2012 datasets show that the Mean Intersection over Union (MIoU) of EOU-Net is improved by 1.8, 0.6 and 4.5 percentage points respectively compared with that of OCRNet. It can be seen that EOU-Net can obtain more accurate segmentation results of stomach cancer images.
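
The TTA post-processing step can be illustrated with the minimal PyTorch sketch below; it is an assumption for illustration, not the authors' code. A hypothetical segmentation network model is run on flipped and rotated copies of the input, each prediction is mapped back to the original orientation, and the results are averaged.

import torch

def tta_predict(model, image):
    """image: (1, 3, H, W) tensor; returns the fused softmax probability map."""
    model.eval()
    preds = []
    with torch.no_grad():
        for k in range(4):                          # 0/90/180/270 degree rotations
            for flip in (False, True):              # with and without horizontal flip
                x = torch.rot90(image, k, dims=(2, 3))
                if flip:
                    x = torch.flip(x, dims=(3,))
                y = torch.softmax(model(x), dim=1)  # per-pixel class probabilities
                if flip:                            # undo the flip
                    y = torch.flip(y, dims=(3,))
                y = torch.rot90(y, -k, dims=(2, 3)) # undo the rotation
                preds.append(y)
    return torch.stack(preds).mean(dim=0)           # fuse all augmented predictions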

Pedestrian trajectory prediction based on multi-head soft attention graph convolutional network
Tao PENG, Yalong KANG, Feng YU, Zili ZHANG, Junping LIU, Xinrong HU, Ruhan HE, Li LI
Journal of Computer Applications    2023, 43 (3): 736-743.   DOI: 10.11772/j.issn.1001-9081.2022020207

The complexity of pedestrian interaction is a challenge for pedestrian trajectory prediction: existing algorithms struggle to capture meaningful interaction information between pedestrians and cannot model these interactions intuitively. To address this problem, a multi-head soft attention graph convolutional network was proposed. Firstly, Multi-head Soft ATTention (MS ATT) combined with an involution network was used to extract a sparse spatial adjacency matrix and a sparse temporal adjacency matrix from the spatial and temporal graph inputs respectively, generating a sparse spatial directed graph and a sparse temporal directed graph. Then, a Graph Convolutional Network (GCN) was used to learn interaction and motion trend features from these sparse directed graphs. Finally, the learned trajectory features were input into a Temporal Convolutional Network (TCN) to predict the parameters of a double Gaussian distribution, from which the predicted pedestrian trajectories were generated. Experiments on the Eidgenossische Technische Hochschule (ETH) and University of CYprus (UCY) datasets show that, compared with the Space-time sOcial relationship pooling pedestrian trajectory Prediction Model (SOPM), the proposed algorithm reduces the Average Displacement Error (ADE) by 2.78%, and compared with the Sparse Graph Convolution Network (SGCN), it reduces the Final Displacement Error (FDE) by 16.92%.
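
The final prediction step can be sketched as follows; this is not the authors' implementation, and the class name GaussianHead, the tensor shapes and the temporal pooling are assumptions. A temporal convolution maps trajectory features to the five parameters of a bi-variate ("double") Gaussian per future frame, from which trajectories can be sampled.

import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    def __init__(self, feat_dim: int, pred_len: int):
        super().__init__()
        # 1D temporal convolution over the observed time steps
        self.tcn = nn.Conv1d(feat_dim, 5 * pred_len, kernel_size=3, padding=1)
        self.pred_len = pred_len

    def forward(self, feats):                     # feats: (N, feat_dim, T_obs)
        out = self.tcn(feats).mean(dim=2)         # pool over time -> (N, 5 * pred_len)
        out = out.view(-1, self.pred_len, 5)
        mu = out[..., :2]                         # means (mu_x, mu_y)
        sigma = torch.exp(out[..., 2:4])          # standard deviations, kept positive
        rho = torch.tanh(out[..., 4])             # correlation coefficient in (-1, 1)
        return mu, sigma, rho

def sample_trajectory(mu, sigma, rho):
    """Draw one future trajectory per pedestrian from the predicted Gaussians."""
    eps = torch.randn_like(mu)
    x = mu[..., 0] + sigma[..., 0] * eps[..., 0]
    y = (mu[..., 1] + sigma[..., 1] *
         (rho * eps[..., 0] + torch.sqrt(1 - rho ** 2) * eps[..., 1]))
    return torch.stack([x, y], dim=-1)            # (N, pred_len, 2)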

Cascaded cross-domain feature fusion for virtual try-on
Xinrong HU, Junyu ZHANG, Tao PENG, Junping LIU, Ruhan HE, Kai HE
Journal of Computer Applications    2022, 42 (4): 1269-1274.   DOI: 10.11772/j.issn.1001-9081.2021071274

Virtual try-on technologies based on an image synthesis mask strategy can better retain clothing details when the warped clothing is fused with the human body. However, because the position and structure of the human body and the clothing are difficult to align during the try-on process, the try-on result is prone to severe occlusion, which degrades the visual quality. To reduce occlusion in the try-on process, a U-Net based generator was proposed, in which a cascaded spatial attention module and a channel attention module were added to the U-Net decoder, thereby achieving cross-domain fusion between local features of the warped clothing and global features of the human body. Specifically, first, the Thin Plate Spline (TPS) transformation was predicted by a convolutional network and used to warp the clothing according to the target human body pose. Then, the dressed-person representation and the warped clothing were input into the proposed generator, which produced a mask image of the corresponding clothing region and rendered an intermediate result. Finally, the mask synthesis strategy was used to composite the warped clothing with the intermediate result through the mask to obtain the final try-on result. Experimental results show that the proposed method can not only reduce occlusion but also enhance image details. Compared with the Characteristic-Preserving Virtual Try-On Network (CP-VTON) method, the images generated by the proposed method have an average Peak Signal-to-Noise Ratio (PSNR) increased by 10.47%, an average Fréchet Inception Distance (FID) decreased by 47.28%, and an average Structural SIMilarity (SSIM) increased by 4.16%.
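
The mask-synthesis step can be summarized by the short sketch below, an illustrative assumption rather than the paper's code: the clothing-region mask produced by the generator blends the warped clothing into the intermediate rendering.

import torch

def mask_compose(warped_cloth, rendered_person, cloth_mask):
    """warped_cloth, rendered_person: (N, 3, H, W) images;
    cloth_mask: (N, 1, H, W) with values in [0, 1]; returns the final try-on image."""
    cloth_mask = cloth_mask.clamp(0.0, 1.0)
    # keep warped clothing inside the mask, the rendered person outside it
    return cloth_mask * warped_cloth + (1.0 - cloth_mask) * rendered_person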

Knowledge graph embedding model based on improved Inception structure
Xiaopeng YU, Ruhan HE, Jin HUANG, Junjie ZHANG, Xinrong HU
Journal of Computer Applications    2022, 42 (4): 1065-1071.   DOI: 10.11772/j.issn.1001-9081.2021071265

Knowledge Graph Embedding (KGE) maps entities and relations into a low-dimensional continuous vector space and uses machine learning methods to support applications on relational data, such as knowledge analysis, reasoning and completion. With ConvE (Convolution Embedding) as a representative, Convolutional Neural Networks (CNNs) have been applied to knowledge graph embedding to capture the interaction information between entities and relations, but standard convolution has an insufficient ability to capture feature interactions and a low feature expression ability. To address this insufficient feature interaction ability, an improved Inception structure was proposed, and a knowledge graph embedding model named InceE was constructed on this basis. Firstly, hybrid dilated convolution replaced standard convolution to improve the ability to capture feature interaction information. Secondly, a residual network structure was used to reduce the loss of feature information. Experiments were carried out on the Kinship, FB15k and WN18 datasets to verify the link prediction effectiveness of InceE. Compared with the ArcE and QuatRE models on the Kinship and FB15k datasets, the Hit@1 of InceE increased by 1.6 and 1.5 percentage points; compared with ConvE on the three datasets, the Hit@1 of InceE increased by 6.3, 20.8 and 1.0 percentage points. The experimental results show that InceE has a stronger ability to capture feature interaction information.
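
A hybrid dilated convolution branch with a residual connection, in the spirit of the improved Inception structure described above, can be sketched as follows; the channel configuration and the dilation rates (1, 2, 3) are assumptions, not the paper's exact settings.

import torch
import torch.nn as nn

class HybridDilatedBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Stacked convolutions with increasing, pairwise co-prime dilation rates
        # enlarge the receptive field over the reshaped entity-relation feature
        # map while avoiding the gridding effect of a single large dilation.
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, dilation=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=2, dilation=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=3, dilation=3),
        )

    def forward(self, x):                        # x: (N, C, H, W)
        return torch.relu(x + self.branch(x))    # residual connection reduces feature loss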
